Abstract

Motivated by various applications, we consider the problem of homogeneous human population size ($N$) estimation from a Dual-record system (DRS) (equivalently, a two-sample capture-recapture experiment). The likelihood estimate from the independent capture-recapture model $M_t$ is widely used in this context, though the appropriateness of the behavioral dependence model $M_{tb}$ is unanimously acknowledged. Our primary aim is to investigate the use of several relevant pseudo-likelihood methods profiling $N$, explicitly for model $M_{tb}$. An adjustment over the profile likelihood is proposed. Simulation studies are carried out to evaluate the performance of the proposed method compared with the Bayes estimate suggested for general capture-recapture experiments by Lee et al. (Statistica Sinica, 2003, vol. 13). We also analyse the effect of possible model mis-specification, due to the use of model $M_t$, in terms of efficiency and robustness. Finally, two real-life examples with different characteristics are presented to illustrate the methodologies discussed.

Key words: Adjusted profile likelihood; Behavioral response; Model mis-specification; Modified profile likelihood; Nuisance parameters; Robustness.


1 Introduction 

The estimation of human population size is an important statistical problem with a wide range of applications in epidemiology, demography and official statistics. A census or civil registration system often fails to capture the true size of the population. Usually, another survey is conducted independently after the census operation to estimate the number of events missed in the census count. This is equivalent to the capture-recapture principle for estimating the true size, say $N$, of the target population. Several likelihood models, along with associated estimates from the capture-recapture technique, were first addressed by Otis et al. (1978 [21]) for different plausible situations with $T (\geq 2)$ independent sources of information. Application of this technique to estimate the number of affected people in an epidemiological study or in a particular event (such as war or natural calamity) is also popular across disciplines. In the context of human populations, more than two sources of information are hardly ever available.
Different models for population size estimation based on the Dual-record system (DRS) have been well sketched by Wolter (1986 [29]). In practice, for a homogeneous group, model $M_t$ has received much attention from both frequentist and Bayesian statisticians. $M_t$ accounts for the time ($t$) variation effect and assumes independence between the sources of information. This model was first analysed by Chandrasekar and Deming (1949 [8]) for the estimation of vital events in a human population. Various frequentist and likelihood approaches are present in the capture-recapture literature (see Bishop et al. (1975 [5]), Huggins (1989 [16])). The Bayesian approach was pioneered by Robert (1967 [23]), Castledine (1981), Smith (1988 [27]; 1991 [28]) and George and Robert (1990 [13], Technical Report). George and Robert (1992) first gave an extensive account of population size estimation through hierarchical Bayesian analysis via Gibbs sampling on model $M_t$. But this common model would not be appropriate in most situations for human populations, especially when capture probabilities also vary with behavioral response. At the time of the second capture, those who were caught in the first sample may behave significantly differently from those who were not captured previously. When both the time ($t$) variation effect and the behavioral response ($b$) effect act together, we have the more complicated model $M_{tb}$, where the behavioral response effect is modelled by the parameter $\phi$. In particular, when $\phi = 1$, $M_{tb}$ reduces to $M_t$. Otis et al. (1978 [21]) addressed the non-identifiability problem related to this model, and Chao et al. (2000 [9]) derived the mle following Lloyd's (1994 [19]) assumption only when $T \geq 3$. Although the relevance of model $M_{tb}$ is understood in many situations, owing to the lack of identifiability for DRS (i.e. when $T = 2$), $M_{tb}$ is seldom used for human populations and model $M_t$ is widely employed for its simplicity in both demographic and epidemiological studies. Hence the issue of model mis-specification arises. Lee and Chen (1998 [17]) and Lee et al. (2003 [18]) successfully applied the subjective Bayesian technique to $M_{tb}$ for $T \geq 3$ through Gibbs sampling. Chatterjee and Mukherjee (2014 [10]) discuss some issues related to the full Bayes method specifically for DRS and develop some empirical Bayes strategies, considering the problem of $N$ estimation in a missing data framework. In the Bayesian paradigm, difficulty may arise because the resulting estimator for $N$ may be very sensitive to the choice of prior(s).

Estimation of the population size $N$ from $M_{tb}$ is the main interest of this article; another aim is to study the effect of model mis-specification due to the use of model $M_t$ even when $\phi$ is in a small neighbourhood of 1. Here, all the model parameters except $N$ are regarded as nuisance parameters. Some useful likelihood-based inference methods, built on pseudo-likelihood functions that eliminate the nuisance parameters, are discussed in the literature (see Cox, 1975 [11]; Basu, 1977 [3]; Berger et al., 1999 [4]). To our knowledge, the profile and adjusted profile likelihoods (Cox and Reid, 1987 [12]) for model $M_t$ have been studied by Bolfarine et al. (1992 [6]). Recently, Salasar et al. (2014 [23]) analysed the integrated likelihood approach, another pseudo-likelihood method, with uniform and Jeffreys priors for eliminating nuisance parameters in $M_t$. However, in this article, we confine ourselves to the profile likelihood and some of its relevant modifications that can summarize the set of likelihoods $\{L(N, \psi \mid x) : \psi \in \Psi\}$ over $\Psi$. The goal of the article is to explicitly investigate the potential of these profile-likelihood-related methods for both models $M_{tb}$ and $M_t$ in the DRS context only. We also propose an adjustment to the profile likelihood for the generic model $M_{tb}$. In summary, this article is framed to evaluate the extent of inefficiency in the simple estimate $\hat{N}_t$ and also to provide a non-Bayesian alternative for model $M_{tb}$.

In the next section, we discuss the models $M_t$ and $M_{tb}$ in the DRS context. The performance of the widely used estimate $\hat{N}_{ind}$ from model $M_t$ is analysed in terms of bias and variance when the independence assumption is violated due to behavioral response change. In section 3, the profile and modified profile likelihood functions are discussed and implemented for the models of interest. From there, we develop an adjustment to the profile likelihood for $M_{tb}$ in section 4. The proposed adjusted profile likelihood approach is evaluated by an extensive simulation study in section 5 and compared with the Bayes estimate sketched by Lee et al. (2003 [18]). Comparative graphical investigations of the performance and robustness of the proposed approach are carried out against the common estimate $\hat{N}_{ind}$. Then, our method is illustrated through application to real datasets and finally, in section 6, we summarize our findings and provide some comments on the usefulness of the above profile-likelihood-based approaches.

2 Dual Record System: Preliminaries 

Let us consider a given human population $U$ whose size $N$ is to be estimated; any attempt to enlist all individuals in $U$ is believed to be incomplete, as it fails to capture everyone. To obtain a better estimate of the true $N$, at least two sources of information covering that population are needed. In this paper we concentrate on models that share two common assumptions: (1) the population is closed within the time between the two sources, (2) individuals are homogeneous with respect to capture probabilities in both sources. When information is collected from two sources, it is known as a Dual-record System (DRS). The individuals captured in the first source (list 1) are matched with the list of individuals from the second source (list 2). All captured individuals in $U$ are classified in a multinomial fashion as in Table 1. Let the total number of distinct individuals captured by the two lists be $x_0$ (say); then $x_0 = x_{10} + x_{01} + x_{11}$. Clearly, the number of individuals missed by both systems, $x_{00}$, is unknown, and that makes the total population size $N (= x_{..})$ unknown. The expected proportion or probability associated with each cell is also given, and this notation is followed throughout the paper. Combining all the information, an estimate of $N$ could


Table 1: 2 x 2 table for Dual-System Model

                               List 1
List 2         In              Out             Total

I. Observed sample numbers
In             $x_{11}$        $x_{01}$        $x_{.1}$
Out            $x_{10}$        $x_{00}$        $x_{.0}$
Total          $x_{1.}$        $x_{0.}$        $x_{..} = N$

II. Expected proportions
In             $p_{11}$        $p_{01}$        $p_{.1}$
Out            $p_{10}$        $p_{00}$        $p_{.0}$
Total          $p_{1.}$        $p_{0.}$        1
be obtained assuming different conditions on the individuals' capture probabilities, leading to different models. In this article, we confine ourselves to the models $M_t$ and $M_{tb}$, which are appropriate for a homogeneous human population or sub-population.


2.1 Model Mt 

This model is very simple and widely used for human populations. Two additional assumptions are required. The first is that the two lists are causally independent: an individual's inclusion in List 2 is independent of his/her inclusion in List 1. The second is time variation in the capture probabilities, i.e. the two marginal capture probabilities satisfy $p_{1.} \neq p_{.1}$. Then the associated likelihood for $N (> x_0)$ is

\[ L_t(N, p_{1.}, p_{.1}) = \frac{N!}{x_{11}!\,x_{01}!\,x_{10}!\,(N - x_0)!}\; p_{1.}^{x_{1.}}\, p_{.1}^{x_{.1}}\, (1 - p_{1.})^{N - x_{1.}}\, (1 - p_{.1})^{N - x_{.1}}. \]

The corresponding maximum likelihood estimates are

\[ \hat{N}_t = x_{11} + x_{01} + x_{10} + \frac{x_{01}\,x_{10}}{x_{11}} = \frac{x_{.1}\,x_{1.}}{x_{11}}, \qquad \hat{p}_{1.,t} = \frac{x_{11}}{x_{.1}} \quad \text{and} \quad \hat{p}_{.1,t} = \frac{x_{11}}{x_{1.}}. \]

This estimator is well known as the DSE or C-D estimator in the literature on census coverage error estimation, and it is also popular as the Lincoln-Petersen estimator in wildlife population studies. We denote this estimator by $\hat{N}_{ind}$ throughout this paper.
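
As a small illustration of the estimator above, the following sketch computes $\hat{N}_{ind}$ and the two marginal capture-probability estimates from the observed cells of Table 1. The function name `dse_estimate` and the example counts are assumptions made for this illustration only; they are not part of the original analysis.

```python
def dse_estimate(x11, x10, x01):
    """Dual-system (Lincoln-Petersen) estimates under model M_t.

    x11: captured in both lists; x10: List 1 only; x01: List 2 only.
    Returns (N_hat, p1_hat, p2_hat); requires x11 > 0.
    """
    if x11 <= 0:
        raise ValueError("x11 must be positive for a finite estimate")
    x1_dot = x11 + x10          # total captured in List 1
    x_dot1 = x11 + x01          # total captured in List 2
    n_hat = x1_dot * x_dot1 / x11
    p1_hat = x11 / x_dot1       # estimate of p_{1.}, equal to x_{1.} / N_hat
    p2_hat = x11 / x1_dot       # estimate of p_{.1}, equal to x_{.1} / N_hat
    return n_hat, p1_hat, p2_hat


# Hypothetical counts, chosen only for illustration.
n_hat, p1_hat, p2_hat = dse_estimate(x11=200, x10=100, x01=50)
print(round(n_hat, 1), round(p1_hat, 3), round(p2_hat, 3))
```

The guard on `x11` reflects the fact that the estimate is finite only when at least one individual appears in both lists.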


2.2 Model Mtb 

The causal independence assumption is often criticised in surveys and censuses of human populations. An individual who is captured in the first attempt may have more (or less) chance of being included in the second list than an individual who has not been captured in the first attempt. This change in behavior may occur for different reasons (see Wolter, 1986 [29]) and it is broadly known as behavioral response variation. When this chance is higher, the corresponding individuals are treated as recapture prone; when it is lower, the individuals are recapture averse. When this feature is combined with the time variation assumption, one obtains the relatively complex model $M_{tb}$. To model this situation one has to impose the assumption, following Wolter (1986 [29]), that the probability of first capture is the same for each individual in the population, that is,

Prob($i$th individual is captured in List 1) $= p_{1.}$,

Prob($i$th individual is captured in List 2 | not captured in List 1) $= p_{01}/p_{0.} = p$,

and the probability of recapture, Prob($i$th individual is captured in List 2 | he/she is captured in List 1) $= p_{11}/p_{1.} = c$. But this model has an identifiability issue, as the corresponding likelihood function

\[ L_{tb}(N, p_{1.}, p, c) \propto \frac{N!}{(N - x_0)!}\; p_{1.}^{x_{1.}} (1 - p_{1.})^{N - x_{1.}}\, p^{x_{01}} (1 - p)^{N - x_0}\, c^{x_{11}} (1 - c)^{x_{10}}, \qquad (1) \]

for $N > x_0$, involves fewer sufficient statistics $(x_{11}, x_{01}, x_{10})$ than parameters $(N, p_{1.}, p, c)$ (see Otis et al., 1978 [21]). A popular assumption is that the recapture probability at the second sample, $c$, is equal to a constant multiple of the probability of first-time capture at the second attempt, $p$. Hence, $c = \phi p$, and Chao et al. (2000 [9]) adopted this from Lloyd (1994 [19]) to get rid of the problem. The likelihood then becomes

\[ L_{tb}(N, p_{1.}, p, \phi) \propto \frac{N!}{(N - x_0)!}\; \phi^{x_{11}}\, p_{1.}^{x_{1.}} (1 - p_{1.})^{N - x_{1.}}\, p^{x_{.1}} (1 - p)^{N - x_0}\, (1 - \phi p)^{x_{10}}, \qquad (2) \]

where $\phi$, the behavioral response effect, is orthogonal to $N$. Lloyd's assumption is helpful when the number of sources is strictly more than two, but the identifiability problem persists in DRS. $\phi$ and $p$ are not separately identifiable; only their product $c$ is. Thus, likelihood (2) is more ill-behaved than (1). Replacing $p$ with $c/\phi$ in (1), one obtains another parametrization in which $\phi$ is not at all orthogonal to $N$.
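
The identifiability problem can be made concrete with a short numerical sketch. Under the capture probabilities defined above, the three observed cells have expected counts $N p_{1.} c$, $N p_{1.}(1 - c)$ and $N(1 - p_{1.})p$, so different values of $(N, p_{1.}, p)$ can produce the same expected observed table. The function names and parameter values below are hypothetical and serve only to illustrate this point.

```python
def expected_observed_cells(n, p1, p, c):
    """Expected counts (x11, x10, x01) under model M_tb.

    n: population size; p1: P(captured in List 1);
    p: P(captured in List 2 | not in List 1);
    c: P(captured in List 2 | in List 1).
    """
    e11 = n * p1 * c
    e10 = n * p1 * (1.0 - c)
    e01 = n * (1.0 - p1) * p
    return e11, e10, e01


def matching_parameters(n_new, e11, e10, e01):
    """Parameters (p1, p, c) that reproduce the same expected observed
    counts for a different population size n_new (> e11 + e10 + e01)."""
    e1_dot = e11 + e10
    p1 = e1_dot / n_new
    c = e11 / e1_dot
    p = e01 / (n_new - e1_dot)
    return p1, p, c


# Illustrative values only (not taken from the paper's data).
cells_a = expected_observed_cells(500, 0.60, 0.50, 0.75)
p1_b, p_b, c_b = matching_parameters(800, *cells_a)
cells_b = expected_observed_cells(800, p1_b, p_b, c_b)
print(cells_a, cells_b)
```

In this sketch the recapture probability $c$ is the same in both settings, while $N$, $p_{1.}$ and $p$ all change, which is consistent with the remark that $c$ is identifiable but $N$ is not without further assumptions.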


2.3 Model Mis-specification 

In several real-life applications to homogeneous human populations or sub-populations, the estimator $\hat{N}_{ind}$ derived from model $M_t$ is often used even though the appropriateness of model $M_{tb}$ is well understood. Hence, a threat of model mis-specification naturally arises if $\hat{N}_{ind}$ is used. In this section we investigate how serious that threat could be. First, consider the following lemma (see Raj, 1977 [22]).

Lemma 1. Suppose $x$, $y$ and $z$ are three random variables with finite moments up to second order. Then a large-sample approximation to the mean of $xy/z$ is

\[ E\left(\frac{xy}{z}\right) \approx \frac{E(x)E(y)}{E(z)} \left(1 + \frac{C(x,y)}{E(x)E(y)} - \frac{C(x,z)}{E(x)E(z)} - \frac{C(y,z)}{E(y)E(z)} + \frac{V(z)}{E^2(z)}\right). \]

Replacing $x$, $y$ and $z$ by $x_{1.}$, $x_{.1}$ and $x_{11}$ respectively in the above lemma, we obtain the bias stated in the next theorem. The variance is also computed using the same lemma with suitable replacements.


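Theorem 1. Suppose the actual underlying model is $M_{tb}$ with parametrization $(N, p_{1.}, p, \phi)$. Then the second-order large-sample approximations to the bias and variance of $\hat{N}_{ind} = x_{1.}x_{.1}/x_{11}$ for estimating $N$ are

\[ \mathrm{Bias}\left(\hat{N}_{ind}\right)_{M_{tb}} \approx N(1 - p_{1.})\,\frac{1 - \phi}{\phi} + \frac{1}{\phi}\,\frac{(1 - p_{1.})(1 - \phi p)}{p_{1.}\,\phi p}, \]

\[ \mathrm{Var}\left(\hat{N}_{ind}\right)_{M_{tb}} \approx N\,\frac{(1 - p_{1.})(1 - \phi p)}{p_{1.}\,\phi p}. \]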


Clearly, when $\phi$ increases above one, the second part of the right-hand side of the bias gradually shrinks to 0, as $p_{1.}$ and $\phi p = c$ are expected to be more than 0.5. Hence, the simple estimate $\hat{N}_{ind}$ underestimates $N$ and its bias $\to -N(1 - p_{1.})$ as $\phi (> 1)$ increases. Similarly, when $\phi (< 1)$ decreases to 0, $\hat{N}_{ind}$ increasingly overestimates $N$. Thus, the assumption $\phi = 1$ can be very risky and the use of $\hat{N}_{ind}$ may lead to an inefficient estimate. On the other hand, if $\phi$ is exactly 1 (i.e. the list-independence case), the bias reduces to $(1 - p_{1.})(1 - p_{.1})/(p_{1.}p_{.1})$, as $p = p_{.1}$ under independence. Therefore, the bias will be negligible when $p_{1.}$ and $p_{.1}$ are both large. The result also shows that s.e.$(\hat{N}_{ind})$ is proportional to $N^{1/2}$ under $M_{tb}$. Even when $\phi = 1$,

\[ \text{s.e.}(\hat{N}_{ind}) = \left[\frac{(1 - p_{1.})(1 - p_{.1})}{p_{1.}\,p_{.1}}\right]^{1/2} N^{1/2}. \]

Our discussion of pseudo-likelihood methods in the next two sections is based on both models $M_{tb}$ and $M_t$, since model $M_t$ is often used in practice and $M_{tb} = M_t$ only when $\phi = 1$.
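
The direction and rough size of the mis-specification bias can be checked by simulation. The sketch below draws dual-record data under $M_{tb}$, averages $\hat{N}_{ind}$ over replicates, and compares the Monte Carlo bias with the leading bias term, taken here as $N(1 - p_{1.})(1 - \phi)/\phi$. The function names, the seed, the number of replicates and the parameter values are all assumptions made for this illustration.

```python
import random


def simulate_drs(n, p1, p, phi, rng):
    """Draw one dual-record data set (x11, x10, x01) under model M_tb.

    Each individual enters List 1 with probability p1; List 2 capture
    has probability c = phi * p if already in List 1, and p otherwise.
    """
    c = phi * p
    x11 = x10 = x01 = 0
    for _ in range(n):
        in_list1 = rng.random() < p1
        in_list2 = rng.random() < (c if in_list1 else p)
        if in_list1 and in_list2:
            x11 += 1
        elif in_list1:
            x10 += 1
        elif in_list2:
            x01 += 1
    return x11, x10, x01


def mean_n_ind(n, p1, p, phi, reps=2000, seed=1):
    """Monte Carlo mean of N_ind = x1. * x.1 / x11 over `reps` data sets."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        x11, x10, x01 = simulate_drs(n, p1, p, phi, rng)
        total += (x11 + x10) * (x11 + x01) / x11
    return total / reps


# Illustrative setting: N = 500, p1. = 0.60, p.1 = 0.70, phi = 1.25.
n, p1, p_dot1, phi = 500, 0.60, 0.70, 1.25
p = p_dot1 / (p1 * phi + 1.0 - p1)   # from p.1 = p1.*phi*p + (1 - p1.)*p
leading_bias = n * (1.0 - p1) * (1.0 - phi) / phi
mc_bias = mean_n_ind(n, p1, p, phi) - n
print(round(leading_bias, 1), round(mc_bias, 1))
```

In this recapture-prone setting both quantities are negative, in line with the statement that $\hat{N}_{ind}$ underestimates $N$ when $\phi > 1$.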

3 Some Pseudo-likelihood Methods 

Let us consider a statistical model with likelihood function $L(\lambda \mid x)$ with $\lambda = (\theta, \psi)$, where $\theta$ is the parameter of interest and $\psi$ represents the nuisance parameter; both may be vector valued. The presence of several nuisance parameters in the model affects comparative inferential study based on the likelihood (see Basu (1977 [3]), Severini (2000 [26])). Our aim now is to find a function that can summarize the set of likelihoods $\mathcal{L}^* = \{L(\theta, \psi \mid x) : \psi \in \Psi\}$ over $\Psi$. That summarizing function is denoted by $L^*(\theta)$ and is treated somewhat like a likelihood function of $\theta$, as if the inference frame had $\theta$ as the only parameter. We refer to such functions $L^*(\theta)$ as pseudo-likelihood functions of $\theta$. This class of pseudo-likelihood functions includes the profile likelihood function. Modified profile likelihoods (Barndorff-Nielsen, 1983 [1] and 1985 [2]) and adjusted profile likelihoods (Cox and Reid, 1987 [12]) are basically modifications of the profile likelihood function. There are several other kinds of pseudo-likelihood functions in the literature, such as marginal, conditional, partial (Cox, 1975 [11]) and integrated likelihood (Berger et al., 1999 [4]) functions. In the present context, interest lies basically in $N$ and sometimes also in $\phi$ in $M_{tb}$. We restrict ourselves to the profile likelihood function, obtained by summarising the original data likelihood over the domain of the nuisance parameter, and some of its suitable modifications. Moreover, we propose an adjustment to the profile likelihood, driven by an adjustment coefficient, so that the resulting likelihood estimate satisfies some desirable frequentist properties.

3.1 Profile Likelihood (PL) Method 

This approach summarizes $\mathcal{L}^*$ at $\psi = \hat{\psi}_\theta$, the conditional mle of $\psi$ for given $\theta$. Hence, the profile likelihood (PL) for $\theta$ is $L^P(\theta) = L(\theta, \hat{\psi}_\theta \mid x)$. Inference about $\theta$ is made by maximizing $L^P(\theta)$ (or $\log L^P(\theta)$), treating it as a likelihood function (or log-likelihood function) of $\theta$. In general, however, it is not a proper likelihood function. Thus, inferences based on this assumption may be misleading, specifically when $\psi$ is high-dimensional.

In the context of the independent model $M_t$, the PL for the interest parameter $N$ is given by

\[ L^P_t(N) = \frac{N!}{(N - x_0)!}\,(N - x_{1.})^{N - x_{1.}}\,(N - x_{.1})^{N - x_{.1}}\,N^{-2N}, \]

for $N > \max(x_{1.}, x_{.1}, x_0) = x_0$. Here, as elsewhere in the paper, multiplicative terms not depending on $N$ in the likelihood function of $N$ have been ignored.

Theorem 2. $L^P_t(N)$ is increasing in $N$ for $N < (x_{1.}x_{.1}/x_{11}) - 1$ and hence, when $(x_{1.}x_{.1}/x_{11})$ is an integer, the corresponding mle $\hat{N}^P_t$ is $(x_{1.}x_{.1}/x_{11}) - 1$. When $(x_{1.}x_{.1}/x_{11})$ is not an integer, $\hat{N}^P_t$ is either $[x_{1.}x_{.1}/x_{11}] - 1$ or $[x_{1.}x_{.1}/x_{11}]$, according to which produces the maximum value of the profile likelihood, where $[u]$ denotes the greatest integer not greater than $u$, for $u \in \mathbb{R}$.

$\hat{N}^P_t$ is finite iff $x_{11} > 0$. The maximum profile likelihood (PL) estimate can also be obtained by maximising $L^P_t(N)$ treating $N$ as a real number and using the formula for the digamma function at any positive integer $z$ (obtained from its recursion relation), $\beta(z) = (d/dz)\log \Gamma(z) = -\gamma + \sum_{a=1}^{z-1} (1/a)$, where $\gamma$ is the Euler-Mascheroni constant.

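For any parametrization of model $M_{tb}$, such as (1) or (2), the PL for $N$ reduces to

\[ L^P_{tb}(N) = \frac{N!}{(N - x_0)!}\,(N - x_0)^{N - x_0}\,N^{-N}, \]

for $N > x_0$, as the PL is parametrization invariant. Clearly, $L^P_{tb}(N)$ is decreasing in $N$ for $N > x_0$: it can be written as $L^P_{tb}(N) = \frac{N!}{(N - x_0)!\,N^{x_0}}\,(1 - x_0/N)^{N - x_0}$, and the ratio of successive values is

\[ \frac{L^P_{tb}(N + 1)}{L^P_{tb}(N)} = \frac{\{1 + 1/(N - x_0)\}^{N - x_0}}{(1 + 1/N)^{N}} < 1, \]

because $(1 + 1/m)^m$ increases with $m$. Hence, the mle will be the lower bound of $N$, i.e. $\hat{N}^P_{tb} = (x_0 + 1)$. It is clear that this pseudo-likelihood is not useful, as it stands, for estimating the population size $N$.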
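
A quick numerical check of this behaviour is sketched below. It evaluates the profile log-likelihood of $N$ under $M_{tb}$, taken as $\log N! - \log (N - x_0)! + (N - x_0)\log(N - x_0) - N \log N$, on a grid of integers above $x_0$. The value of $x_0$ and the grid width are hypothetical choices for illustration.

```python
import math


def log_pl_tb(n, x0):
    """Profile log-likelihood of N under M_tb (terms free of N dropped)."""
    return (math.lgamma(n + 1) - math.lgamma(n - x0 + 1)
            + (n - x0) * math.log(n - x0) - n * math.log(n))


x0 = 422  # hypothetical number of distinct captured individuals
values = [log_pl_tb(n, x0) for n in range(x0 + 1, x0 + 301)]
is_decreasing = all(b < a for a, b in zip(values, values[1:]))
n_max = x0 + 1 + values.index(max(values))
print(is_decreasing, n_max)
```

On such a grid the maximiser sits at the boundary $x_0 + 1$, which is why the plain profile likelihood carries no usable information about $N$ here.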

3.2 Modified Profile Likelihood (MPL) and Its Approximation (AMPL) 

Since marginal and conditional likelihoods are not available for $M_{tb}$, the idea is to use a suitable modification of the profile likelihood. Several such modifications are suggested in the literature. The PL cannot approximate a marginal or conditional likelihood function, and that leads to poor performance. We now discuss a modification of the profile likelihood function. In general, the modified profile likelihood (MPL) proposed by Barndorff-Nielsen (1983 [1], 1985 [2]) is written as

\[ L^{MP}(\theta) = D(\theta)\,L^P(\theta), \qquad (3) \]

where $D(\theta) = |j_{\psi\psi}(\theta, \hat{\psi}_\theta)|^{-1/2}\,|\partial\hat{\psi}_\theta/\partial\hat{\psi}|^{-1}$, the second factor is the inverse of the Jacobian $J(\theta) = \partial\hat{\psi}_\theta/\partial\hat{\psi}$, and $j_{\psi\psi}(\theta, \hat{\psi}_\theta)$ is the observed Fisher information of $\psi$ for fixed $\theta$. The actual derivation of $L^{MP}(\theta)$ as an approximation to a conditional likelihood is sketched in Severini (2000 [26]), considering $\hat{\psi}_\theta$ as sufficient with $\theta$ held fixed, where $a$ is an ancillary statistic. However, we can simply express the partial derivative factor in $L^{MP}(\theta)$ as follows:
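Let us denote the logarithm of the likelihood $L(\cdot)$ by $\ell(\cdot)$. Then the conditional mle $\hat{\psi}_\theta$ satisfies $\ell_\psi(\theta, \hat{\psi}_\theta) = 0$, where the sufficient statistics may be written as $(\hat{\theta}, \hat{\psi}, a)$, $a$ being ancillary. Then, by differentiating with respect to $\hat{\psi}$, we have

\[ \ell_{\psi\psi}(\theta, \hat{\psi}_\theta)\,\frac{\partial\hat{\psi}_\theta}{\partial\hat{\psi}} + \ell_{\psi;\hat{\psi}}(\theta, \hat{\psi}_\theta) = 0. \]

This implies $\partial\hat{\psi}_\theta/\partial\hat{\psi} = j_{\psi\psi}(\theta, \hat{\psi}_\theta)^{-1}\,\ell_{\psi;\hat{\psi}}(\theta, \hat{\psi}_\theta)$, where $j_{\psi\psi}(\theta, \hat{\psi}_\theta) = -\ell_{\psi\psi}(\theta, \hat{\psi}_\theta)$. Hence, the MPL in (3) may also be written in the following form

\[ L^{MP}(\theta) = \frac{|j_{\psi\psi}(\theta, \hat{\psi}_\theta)|^{1/2}}{|\ell_{\psi;\hat{\psi}}(\theta, \hat{\psi}_\theta)|}\,L^P(\theta), \qquad (4) \]

and hence in (4), $D(\theta) = |j_{\psi\psi}(\theta, \hat{\psi}_\theta)|^{1/2}/|\ell_{\psi;\hat{\psi}}(\theta, \hat{\psi}_\theta)|$ according to the form in (3).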


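There is an approximation to $L^{MP}(\theta)$ suggested by Severini (1998 [25]) in which $D(\theta)$ is taken as $|j_{\psi\psi}(\theta, \hat{\psi}_\theta)|^{1/2}/|I(\theta, \hat{\psi}_\theta; \hat{\theta}, \hat{\psi})|$, where $I(\theta, \psi; \theta_0, \psi_0) = (\partial/\partial\psi_0)\,E\{\ell_\psi(\theta, \psi) \mid \theta_0, \psi_0\}$ is an approximation to $\ell_{\psi;\hat{\psi}}(\theta, \psi)$, as $E\{\ell_\psi(\theta, \psi) \mid \theta_0, \psi_0\} = \ell_\psi(\theta, \psi \mid \theta_0, \psi_0) + O(1)$ and $\ell_{\psi;\hat{\psi}}(\theta, \psi) = (\partial/\partial\psi_0)\,\ell_\psi(\theta, \psi \mid \theta_0, \psi_0)$. Hence, the approximated modified profile likelihood (AMPL) is

\[ L^{AMP}(\theta) = |I(\theta, \hat{\psi}_\theta; \hat{\theta}, \hat{\psi})|^{-1}\,|j_{\psi\psi}(\theta, \hat{\psi}_\theta)|^{1/2}\,L^P(\theta). \qquad (5) \]

Remark 1. Clearly, $L^{MP}(\theta) = L^{AMP}(\theta)$ if and only if $|\ell_{\psi;\hat{\psi}}(\theta, \hat{\psi}_\theta)| = |I(\theta, \hat{\psi}_\theta; \hat{\theta}, \hat{\psi})|$, ignoring the terms not depending on $\theta$.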

Implementation to models $M_t$ and $M_{tb}$:

The following result shows that the MPL and AMPL are identical on the domain $N > x_0$ for model $M_t$. Severini (1998 [25]) only stated this result; an explicit proof is given in the Appendix.

Result 1. $L^{MP}_t(N)$ and $L^{AMP}_t(N)$ are the same for model $M_t$ with $\theta = N$, $\psi = (p_{1.}, p_{.1})$ and, for $N > x_0$, are given by

\[ L^{MP}_t(N) = L^{AMP}_t(N) = \frac{N!}{(N - x_0)!}\,(N - x_{1.})^{(N - x_{1.} + 1/2)}\,(N - x_{.1})^{(N - x_{.1} + 1/2)}\,N^{-(2N + 1)} \]

\[ = L^P_t(N)\,(N - x_{1.})^{1/2}\,(N - x_{.1})^{1/2}\,N^{-1}. \]








An interesting relation between the PL and the MPL for model $M_t$ is formulated in the next theorem. Theorem 4 shows that the MPL estimate is the same as the ordinary likelihood estimate of $N$. Proofs of the following two theorems are also in the Appendix.

Theorem 3. The maximum profile likelihood estimator, $\hat{N}^P_t$, is no greater than the maximum modified profile likelihood estimator $\hat{N}^{MP}_t$.

Theorem 4. $L^{MP}_t(N)$ is increasing in $N$ for $N < (x_{1.}x_{.1}/x_{11}) - 1$ and hence, the corresponding mle, $\hat{N}^{MP}_t$, is $[x_{1.}x_{.1}/x_{11}]$ if $(x_{1.}x_{.1}/x_{11})$ is not an integer, and is $(x_{1.}x_{.1}/x_{11}) - 1$ if $(x_{1.}x_{.1}/x_{11})$ is an integer.

Thus, for $(x_{1.}x_{.1}/x_{11}) \in \mathbb{Z}^+$, the set of positive integers, $\hat{N}^P_t = \hat{N}^{MP}_t = (x_{1.}x_{.1}/x_{11}) - 1$, and for $(x_{1.}x_{.1}/x_{11}) \notin \mathbb{Z}^+$, $\hat{N}^{MP}_t = [x_{1.}x_{.1}/x_{11}] \geq \hat{N}^P_t$.
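
The relation between the two maximisers under $M_t$ can be explored numerically. The sketch below assumes the forms $L^P_t(N) \propto \frac{N!}{(N - x_0)!}(N - x_{1.})^{N - x_{1.}}(N - x_{.1})^{N - x_{.1}}N^{-2N}$ and $L^{MP}_t(N) = L^P_t(N)(N - x_{1.})^{1/2}(N - x_{.1})^{1/2}N^{-1}$, and finds each integer maximiser by brute force. The counts, the search width and the helper names are hypothetical.

```python
import math


def log_pl_t(n, x11, x10, x01):
    """Profile log-likelihood of N under M_t (terms free of N dropped)."""
    x0 = x11 + x10 + x01
    x1_dot, x_dot1 = x11 + x10, x11 + x01
    return (math.lgamma(n + 1) - math.lgamma(n - x0 + 1)
            + (n - x1_dot) * math.log(n - x1_dot)
            + (n - x_dot1) * math.log(n - x_dot1)
            - 2 * n * math.log(n))


def log_mpl_t(n, x11, x10, x01):
    """Modified profile log-likelihood of N under M_t (Result 1)."""
    x1_dot, x_dot1 = x11 + x10, x11 + x01
    return (log_pl_t(n, x11, x10, x01)
            + 0.5 * math.log(n - x1_dot)
            + 0.5 * math.log(n - x_dot1)
            - math.log(n))


def integer_argmax(log_lik, x11, x10, x01, width=2000):
    """Brute-force maximiser over integers N = x0 + 1, ..., x0 + width."""
    x0 = x11 + x10 + x01
    return max(range(x0 + 1, x0 + width + 1),
               key=lambda n: log_lik(n, x11, x10, x01))


# Hypothetical counts, chosen only for illustration.
x11, x10, x01 = 228, 72, 122
ratio = (x11 + x10) * (x11 + x01) / x11     # x1. * x.1 / x11, about 460.5
n_pl = integer_argmax(log_pl_t, x11, x10, x01)
n_mpl = integer_argmax(log_mpl_t, x11, x10, x01)
print(ratio, n_pl, n_mpl)
```

A grid search is used instead of a closed form so that the statements of Theorems 2 to 4 can be checked directly against the likelihood values for any given table.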

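Next, we present the computation of the MPL and AMPL in the context of model $M_{tb}$. Let us consider the parametrization $\theta = N$, $\psi = (p_{1.}, p, c)$. By differentiating the log-likelihood (from (1)) with respect to $\psi$, we have

\[ \ell_\psi(\theta, \psi) = \left(\frac{x_{1.}}{p_{1.}} - \frac{N - x_{1.}}{1 - p_{1.}},\; \frac{x_{01}}{p} - \frac{N - x_0}{1 - p},\; \frac{x_{11}}{c} - \frac{x_{10}}{1 - c}\right). \]

Therefore, writing $\theta_0 = N_0$ and $\psi_0 = (p_{1.,0}, p_0, c_0)$,

\[ E\{\ell_\psi(\theta, \psi) \mid \theta_0, \psi_0\} = \left(\frac{N_0 p_{1.,0}}{p_{1.}} - \frac{N - N_0 p_{1.,0}}{1 - p_{1.}},\; \frac{N_0 p_0 (1 - p_{1.,0})}{p} - \frac{N - N_0 p_0 (1 - p_{1.,0}) - N_0 p_{1.,0}}{1 - p},\; \frac{N_0 c_0 p_{1.,0}}{c} - \frac{N_0 (1 - c_0) p_{1.,0}}{1 - c}\right). \]

Hence, $|I(\theta, \hat{\psi}_\theta; \hat{\theta}, \hat{\psi})| \propto N^2 (N - x_{1.})/(N - x_0)$, since

\[ I(\theta, \psi; \hat{\theta}, \hat{\psi}) = \left[\frac{\partial}{\partial\psi_0} E\{\ell_\psi(\theta, \psi) \mid \theta_0, \psi_0\}\right]_{\theta_0 = \hat{\theta},\, \psi_0 = \hat{\psi}} \]

is a triangular matrix whose diagonal elements are proportional to $\{p_{1.}(1 - p_{1.})\}^{-1}$, $\{p(1 - p)\}^{-1}$ and $\{c(1 - c)\}^{-1}$, evaluated at $\psi = \hat{\psi}_\theta = (x_{1.}/N,\; x_{01}/(N - x_{1.}),\; x_{11}/x_{1.})$.

Again, we have $|\ell_{\psi;\hat{\psi}}(\theta, \hat{\psi}_\theta)| = |j_{\psi\psi}(\theta, \hat{\psi}_\theta)|\,|\partial\hat{\psi}_\theta/\partial\hat{\psi}|$, and $|j_{\psi\psi}(\theta, \hat{\psi}_\theta)|$ is found to be proportional to $N^3 (N - x_{1.})^2 (N - x_0)^{-1}$. Hence, $|\partial\hat{\psi}_\theta/\partial\hat{\psi}| = N^{-1}(N - x_{1.})^{-1}$ and $|\ell_{\psi;\hat{\psi}}(\theta, \hat{\psi}_\theta)| \propto N^2 (N - x_{1.})/(N - x_0)$. So, we have $|\ell_{\psi;\hat{\psi}}(\theta, \hat{\psi}_\theta)| = |I(\theta, \hat{\psi}_\theta; \hat{\theta}, \hat{\psi})|$, ignoring the terms not depending on $\theta = N$. Therefore, from (4) and (5), $L^{MP}_{tb}(N) = L^{AMP}_{tb}(N)$ and hence the following result.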


Result 2. For the model $M_{tb}$ with $\theta = N$, $\psi = (p_{1.}, p, c)$, both $L^{MP}_{tb}(N)$ and $L^{AMP}_{tb}(N)$ are equivalent to

\[ L^{MP}_{tb}(N) = \frac{N!}{(N - x_0)!}\,(N - x_0)^{(N - x_0 + 1/2)}\,N^{-(N + 1/2)} = L^P_{tb}(N)\,(1 - x_0/N)^{1/2}, \quad \text{for } N > x_0. \]

Now, $(d/dN)\ell^{MP}_{tb}(N) = (d/dN)\ell^P_{tb}(N) + \frac{1}{2(N - x_0)} - \frac{1}{2N}$. Using the asymptotic approximation of the gamma function, $\log \Gamma(z + 1) = z\{\log z - 1\} + \log(z)/2 + \log(2\pi)/2 + O(z^{-1})$, we have $(d/dN)\ell^P_{tb}(N) = \frac{1}{2N} - \frac{1}{2(N - x_0)} + O(N^{-2}) = O(-N^{-2}) < 0$ for $N > x_0$. Therefore, $(d/dN)\ell^{MP}_{tb}(N) = (d/dN)\ell^P_{tb}(N) + \frac{x_0}{2N(N - x_0)} \sim O(N^{-2}) > 0$ for $N > x_0$. Hence, clearly, $L^{MP}_{tb}(N)$ also does not give any finite maximum likelihood estimate.
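
The absence of a finite maximiser can also be seen numerically. The sketch below evaluates the modified profile log-likelihood of Result 2, taken as $\log N! - \log(N - x_0)! + (N - x_0 + 1/2)\log(N - x_0) - (N + 1/2)\log N$, on a grid of integers above a hypothetical $x_0$ and checks that it keeps increasing.

```python
import math


def log_mpl_tb(n, x0):
    """Modified profile log-likelihood of N under M_tb (Result 2)."""
    return (math.lgamma(n + 1) - math.lgamma(n - x0 + 1)
            + (n - x0 + 0.5) * math.log(n - x0)
            - (n + 0.5) * math.log(n))


x0 = 422  # hypothetical number of distinct captured individuals
grid = range(x0 + 1, x0 + 301)
values = [log_mpl_tb(n, x0) for n in grid]
is_increasing = all(b > a for a, b in zip(values, values[1:]))
print(is_increasing, values[-1] - values[0])
```

The increments are small but positive across the grid, so the modification reverses the direction of the profile likelihood without producing an interior maximum.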


So far we have seen that $M_{tb}$ is the most suitable underlying model for a homogeneous capture-recapture system, and also that this model fails to yield an estimate even with the modified and approximate modified profile likelihoods. That may lead practitioners to use model $M_t$ (assuming list-independence), whose mle and other profile likelihoods exist. In this paper, we try to assess how much efficiency is lost by the use of $\hat{N}_{ind}$ if list-independence does not hold. The possible threat of model mis-specification due to the use of $M_t$ is discussed in section 2.3. In the next section, we propose a suitable adjustment to the profile likelihood function for model $M_{tb}$ and discuss the conditions under which the associated estimate of $N$ exists. The adjustment is designed to give better frequentist and robustness properties than $\hat{N}_{ind}$, even when $\phi$ lies in a small neighbourhood of 1.


4 Inference Based on An Adjustment to Profile Likelihood 
(AdPL) 

4.1 Proposed Methodology for Mtb and Related Properties 


Having seen the failure of the PL and its two modifications, MPL and AMPL, for $M_{tb}$, we propose here an adjusted version of the profile likelihood. Our proposed adjusted profile likelihood (AdPL) for the generic model $M_{tb}$ with adjustment coefficient $\delta$ ($\in \mathbb{R}$) is

\[ L^{AP}(\theta) = L^P(\theta)\,|j_{\psi\psi}(\theta, \hat{\psi}_\theta)|^{-1/2}\,\left|\frac{\partial\hat{\psi}_\theta}{\partial\hat{\psi}}\right|^{-\delta}. \qquad (6) \]

Note that, in particular, when $\phi = 1$, $M_{tb} \equiv M_t$ and therefore $L^{AP}(\theta)$ will be the same as $L^{MP}(\theta)$ in (3) iff the adjustment coefficient $\delta$ is fixed at 1. That is, for model $M_t$, our proposed AdPL reduces to the MPL given in (3) if $\delta = 1$.


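In the context of model $M_{tb}$ with parametrization (1), we have the following result using (6) and $|\partial\hat{\psi}_\theta/\partial\hat{\psi}| = N^{-1}(N - x_{1.})^{-1}$.

Result 3. For model $M_{tb}$ with $\theta = N$, $\psi = (p_{1.}, p, c)$, the adjusted profile likelihood for all $N > x_0$, according to (6), is given by

\[ L^{AP}_{tb}(N) = L^P_{tb}(N)\,N^{2(\delta - 1)}\,(1 - x_{1.}/N)^{\delta - 1}\,(1 - x_0/N)^{1/2} \]

\[ = L^{MP}_{tb}(N)\,N^{2(\delta - 1)}\,(1 - x_{1.}/N)^{\delta - 1} \]

\[ = \frac{N!}{(N - x_0)!}\,N^{\delta - N - 3/2}\,(N - x_{1.})^{\delta - 1}\,(N - x_0)^{N - x_0 + 1/2}. \]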
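
A minimal sketch of how an AdPL estimate might be computed from Result 3 follows. It takes the log of the last expression, $\log N! - \log(N - x_0)! + (\delta - N - 3/2)\log N + (\delta - 1)\log(N - x_{1.}) + (N - x_0 + 1/2)\log(N - x_0)$, and maximises it over integers by brute force. The counts are hypothetical, and treating $\delta$ as a constant fixed in advance (here $1 - k/500$ for three values of $k$) is an assumption of this sketch rather than a prescription from the text.

```python
import math


def log_adpl_tb(n, x11, x10, x01, delta):
    """Adjusted profile log-likelihood of N under M_tb (Result 3)."""
    x0 = x11 + x10 + x01
    x1_dot = x11 + x10
    return (math.lgamma(n + 1) - math.lgamma(n - x0 + 1)
            + (delta - n - 1.5) * math.log(n)
            + (delta - 1.0) * math.log(n - x1_dot)
            + (n - x0 + 0.5) * math.log(n - x0))


def adpl_estimate(x11, x10, x01, delta, width=5000):
    """Integer N in (x0, x0 + width] maximising the AdPL.

    A maximiser at the upper end of the search range signals that no
    finite maximum was found for this delta.
    """
    x0 = x11 + x10 + x01
    return max(range(x0 + 1, x0 + width + 1),
               key=lambda n: log_adpl_tb(n, x11, x10, x01, delta))


# Hypothetical counts; delta = 1 - k/500 is fixed in advance (assumption).
x11, x10, x01 = 228, 72, 122
estimates = {k: adpl_estimate(x11, x10, x01, 1.0 - k / 500.0)
             for k in (0.75, 1.25, 1.75)}
print(estimates)
```

In this sketch a value of $\delta$ closer to 1 gives a larger estimate, because the penalty $(\delta - 1)\{\log N + \log(N - x_{1.})\}$ on large $N$ becomes weaker.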


The following theorem justifies the condition on the domain of $\delta$ needed to have a finite maximum of the adjusted profile likelihood for $M_{tb}$. The proof is in the Appendix.

Theorem 5. (a) A finite maximum adjusted profile likelihood estimate of $N$ exists for the model $M_{tb}$ only if $\delta < 1$.

(b) For the model $M_{tb}$, $\exists$ some $\delta_0 < 1$ such that $\forall\, \delta < \delta_0$, $L^{AP}_{tb}(N) \downarrow N$ and hence the corresponding mle of $N$ tends to the lower bound $(x_0 + 1)$.

Hence, a choice of $\delta$ either very small or greater than 1 would lead to trivial results. We now try to find a suitable $\delta$ (between $\delta_0$ and 1), or rather a class of suitable $\delta$, in order to obtain a reasonable estimate of $N$. Considering $N$ as real, the first derivative of the adjusted profile log-likelihood is $(d/dN)\ell^{AP}_{tb}(N) = (\delta - 1)/N + (\delta - 1)/(N - x_{1.}) + A_N$, where $A_N = (d/dN)\ell^{MP}_{tb}(N)$ is a positive sequence of order $O(N^{-2})$ for fixed data, since the digamma function satisfies $\beta(N) = \log N + O(N^{-1})$. Equating this derivative to zero we have $(1 - \delta)\,O(N^{-1}) = A_N$, which implies $\delta = 1 - B_N$, where $B_N$ is a positive sequence in $N$ of order $O(N^{-1})$. In practice, one can choose a $\delta$ such that $\delta = 1 - O_p(N^{-1})$.

Remark 2. If we apply the proposed adjustment to the profile likelihood function associated with model $M_t$, then $L^{AP}_t(N)$ can be expressed as

$L^{AP}_t(N) = L^{MP}_t(N)\,N^{2(\delta - 1)}$, for all $N > x_0$.

For the model $M_t$, analogous to Theorem 5, we have the following observations:

Remark 3. (a) There exists some $\delta_0 < 1$ such that $\forall\, \delta < \delta_0$, $L^{AP}_t(N) \downarrow N$ and hence the corresponding mle of $N$ tends to the lower bound $x_0$,

(b) $\exists$ some $\delta' > 1$ such that $\forall\, \delta > \delta'$, $L^{AP}_t(N)$ does not have finite estimates.


4.2 Variance of $\hat{N}^{AP}_{tb}$

It is found in section 2.3 that s.e.$(\hat{N}_t)$ is $O(N^{1/2})$ when independence holds. Hence, to study the nature of variability in $\hat{N}^{AP}_{tb}$, can we postulate that s.e.$(\hat{N}^{AP}_{tb}) = O(N^{\alpha})$ for some $\alpha > 0$? To investigate this and, if so, to get some idea of the magnitude of $\alpha$, we take the following example. Finally, the pattern of variability in $\hat{N}^{AP}_{tb}$ is compared graphically against that of $\hat{N}_t$ under the underlying model $M_{tb}$.


Example: Let us consider four artificial populations S1 ($p_{1.} = 0.60$, $p_{.1} = 0.70$), S2 ($p_{1.} = 0.70$, $p_{.1} = 0.55$), S3 ($p_{1.} = 0.60$, $p_{.1} = 0.70$) and S4 ($p_{1.} = 0.70$, $p_{.1} = 0.55$) following model $M_{tb}$. From each population, we generate 200 data sets $(x_{11}, x_{.1}, x_{1.})$ and obtain $\hat{N}^{AP}_{tb}$ for each data set. These 200 estimates constitute the sampling distribution of the estimator. Finally,
[Figure 1: four panels, S1, S2, S3 and S4]

Figure 1: Comparative plots of $\log_e\{\text{s.d.}(\hat{N})\}$ for both estimates $\hat{N}^{AP}_{tb}$ (dotted line) and $\hat{N}_t$ (continuous line) over several true $\log_e(N)$, plotted for the artificially simulated populations using the capture probabilities mentioned in S1, S2, S3 and S4.


the s.d. over the 200 replicates is calculated to measure the s.e. of the estimate. The same calculations are also done for the estimator $\hat{N}_{ind} = (x_{1.}x_{.1}/x_{11})$ and finally, the comparative behaviour of the ln(s.e.) of both estimators $\hat{N}_{ind}$ and $\hat{N}^{AP}_{tb}$ is plotted against ln($N$) in Figure 1. The figure shows that s.e.$(\hat{N}^{AP}_{tb})$ is less than s.e.$(\hat{N}_t)$ $\forall N$, and the values of the estimated $\alpha$ in s.e.$(\hat{N}^{AP}_{tb})$ are between 0.25 and 0.30 for all the populations. Thus, the numerical investigations carried out above suggest that the proposed adjusted profile likelihood could be more helpful in the context of population size estimation (under the model $M_{tb}$) and that it shows better efficiency than the usual DSE estimator $\hat{N}_{ind}$ in terms of s.e.
5 Numerical Illustrations

5.1 Simulation Study 1 

In this section we consider various artificial populations, reflecting different possible situations under $M_{tb}$, to illustrate the behaviour of the competing estimators in DRS discussed in earlier sections. In any kind of time-ordered samples, the possible list-dependence can be modelled through $\phi$. First, we simulated four populations for each behavioral dependence situation ($\phi = 0.80$ and $\phi = 1.25$ represent recapture averseness and recapture proneness, respectively), which together encompass all possible combinations. The capture probabilities for those populations, each of size $N = 500$, are presented in Table 2. The expected number of distinct captured individuals ($E(x_0) = N(p_{11} + p_{01} + p_{10})$) for each population is also given in Table 2. It is noted that in the first two

Table 2: Populations with N = 500 considered for the simulation study

Population   $\phi$   $p_{1.}$   $p_{.1}$   $E(x_0)$      Population   $\phi$   $p_{1.}$   $p_{.1}$   $E(x_0)$
P1           1.25     0.50       0.65       394           P5           0.80     0.50       0.65       430
P2           1.25     0.60       0.70       422           P6           0.80     0.60       0.70       459
P3           1.25     0.80       0.70       458           P7           0.80     0.80       0.70       483
P4           1.25     0.70       0.55       420           P8           0.80     0.70       0.55       446


populations for each $\phi$, $p_{1.} < p_{.1}$, which refers to the usual situation in DRS data obtained by a specialised survey conducted after a large census operation, e.g. a Post Enumeration Survey (PES). The last two populations, with $p_{1.} > p_{.1}$, represent the opposite case, which is often observed in studies estimating the number of drug users. It is also noted that P2, P4, P6 and P8 are the same as the hypothetical populations S1, S2, S3 and S4 respectively, considered for illustration of the variance of the proposed estimates in section 4.2. Now, 200 data sets $(x_{11}, x_{.1}, x_{1.})$ are generated from each of the above eight populations. We present the adjusted profile likelihood estimate (AdPL) for each situation for different reasonable $\delta$ values. To compare the performance of our proposed method with the Bayesian strategy, we compute the estimates of Lee et al. (2003 [18]). In addition, the estimates assuming list-independence, $\hat{N}_{ind}$, are also shown to empirically understand the extent of bias due to the model mis-specification discussed in section 2.3. For each estimate, several other frequentist measures are shown to evaluate the relative performance of the said estimators. The final estimate of $N$ is obtained by averaging over 200 replications. Based on those 200 estimates, the sample s.e., sample RMSE (Root Mean Square Error) and 95% bootstrap confidence interval (C.I.) are also presented in Table 3 (for $\phi = 1.25$, representing recapture-prone situations) and in a separate table (for $\phi = 0.80$, representing recapture-averse situations). For Lee's Bayes estimates, the 95% credible interval (C.I.) based on sample quantiles of the marginal posterior distribution of $N$ is presented.
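
The $E(x_0)$ column of Table 2 can be recovered from $(\phi, p_{1.}, p_{.1})$ under the relation $c = \phi p$ of section 2.2. The sketch below assumes that $p$ is obtained from the List 2 margin, $p_{.1} = p_{1.}\phi p + (1 - p_{1.})p$, and then evaluates $E(x_0) = N(p_{11} + p_{01} + p_{10})$; the function names are hypothetical. Under these assumptions the computed values agree with the tabulated ones to within one unit.

```python
def mtb_cell_probabilities(p1, p_dot1, phi):
    """Cell probabilities (p11, p10, p01, p00) under M_tb with c = phi * p.

    p is recovered from the List 2 margin: p.1 = p1.*phi*p + (1 - p1.)*p.
    """
    p = p_dot1 / (p1 * phi + 1.0 - p1)
    c = phi * p
    if not (0.0 < p < 1.0 and 0.0 < c < 1.0):
        raise ValueError("inputs do not give valid probabilities")
    return p1 * c, p1 * (1.0 - c), (1.0 - p1) * p, (1.0 - p1) * (1.0 - p)


def expected_distinct(n, p1, p_dot1, phi):
    """E(x0) = N * (p11 + p01 + p10)."""
    p11, p10, p01, _ = mtb_cell_probabilities(p1, p_dot1, phi)
    return n * (p11 + p10 + p01)


# (phi, p1., p.1) for P1-P8 as listed in Table 2, with N = 500.
populations = {
    "P1": (1.25, 0.50, 0.65), "P2": (1.25, 0.60, 0.70),
    "P3": (1.25, 0.80, 0.70), "P4": (1.25, 0.70, 0.55),
    "P5": (0.80, 0.50, 0.65), "P6": (0.80, 0.60, 0.70),
    "P7": (0.80, 0.80, 0.70), "P8": (0.80, 0.70, 0.55),
}
expected = {name: expected_distinct(500, p1, p_dot1, phi)
            for name, (phi, p1, p_dot1) in populations.items()}
for name, value in expected.items():
    print(name, round(value, 1))
```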


Table 3 shows that as $\delta (< 1)$ is chosen closer to 1, AdPL performs better in the case of low capture probabilities (P1 & P4). In the other situations (P2 & P3), where capture probabilities are high, the efficient adjustment coefficient $\delta$ will be $(1 - 1.25N^{-1})$. In other words, we try to analyse the performance from the perspective of two kinds of populations, where $x_{1.} < x_{.1}$
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Table 3: Summary results for populations P1-P4 (representing recapture-prone situations) 


when No directional information on </> is available. 


Method 



PI 

P2 

P3 

P4 

^ind 


iV(s.e.) 

RMSE 

C.I. 

450(14.10) 

51.54 

(425,480) 

460(11.23) 

41.32 

(438,481) 

480(7.07) 

20.55 

(465,493) 

469(12.01) 

32.55 

(444,491) 

Lec|^ 


iV(s.e.) 

RMSE 

C.I. 

468(20.56) 

37.94 

(398,561) 

483(18.45) 
24.97 
(426,560) 

485(6.61) 

16.97 

(460,513) 

471(8.11) 

30.61 

(422,542) 

AdPl 

6 = 1- 0.75iV-i 

lV(s.e.) 

RMSE 

C.I. 

486(12.15) 

18.86 

(461,507) 

513(10.61) 

17.01 

(491,532) 

539(7.15) 

39.82 

(525,552) 

499(9.74) 

9.61 

(578,516) 


6 = 1- 1.25iV-i 

lV(s.e.) 

RMSE 

C.I. 

461(11.47) 

40.32 

(439,480) 

488(10.01) 
15.54 
(467,506) 

515(6.78) 

16.32 

(501,527) 

476(9.27) 

25.68 

(456,493) 


6 = 1- 1.75iV-i 

lV(s.e.) 

RMSE 

C.I. 

449(11.13) 

51.64 

(428,469) 

476(9.77) 

25.85 

(455,493) 

504(6.60) 

7.71 

(491,516) 

466(9.02) 

35.23 

(446,482) 


“with prior n((j)) = U(0.5,2). [Chatterjee and Mukherjee, p.p. 14 (2014 [10)')] 


and xi. > x.i- For both kind of situations xi. < x,i {i.e. PI & P2) and xi, >io {i-e. P3 &: 
P4), AdPL performs progressively better as 6{< 1) is chosen to be closer to 1. Except P3, 
AdPL shows more efficient result than Lee’s method. In any recapture prone situation, the 
use of Nind will certainly mislead us, particularly for the cases where capture probabilities 
are low and/or when underlying (j) is far above 1. 

Similarly, when we turn to analyse the hypothetical populations with recapture averseness, we
see from Table 4 that when $\delta$ is chosen relatively smaller, at $(1 - 1.75N^{-1})$, AdPL performs
reasonably better. In low capture situations (P5 and P8), AdPL shows more efficient results
than Lee's method. Table 4 also shows that in any recapture-averse situation, $\hat N_{ind}$ will
highly overestimate N, as $\phi$ is substantially different from 1.


Hence, in both situations of recapture aversion and proneness, the poor performance of
$\hat N_{ind}$ becomes worse, particularly for the populations where $x_{1\cdot} < x_{\cdot 1}$. Lee's Bayes
estimate, with prior $\pi(\phi) = U(0.5, 2)$, generally underestimates for $\phi > 1$ and overestimates
for $\phi < 1$, but the use of their estimate is recommended over that of $\hat N_{ind}$ to avoid serious
model mis-specification. However, we found that our proposed adjusted profile likelihood method,
with a suitably chosen value of $\delta$, can perform better than Lee's.

Table 4: Summary results for populations P5-P8 (representing recapture-averse situations)
when no directional information on $\phi$ is available.

Method                            Measure           P5           P6           P7           P8
$\hat N_{ind}$                    $\hat N$(s.e.)    563(23.15)   550(14.94)   526(8.08)    538(14.26)
                                  RMSE              67.21        52.48        27.09        40.44
                                  C.I.              (523,615)    (524,578)    (510,541)    (513,565)
Lee (a)                           $\hat N$(s.e.)    474(20.80)   512(15.76)   516(6.17)    517(13.02)
                                  RMSE              35.58        19.83        18.71        21.75
                                  C.I.              (431,566)    (461,575)    (486,553)    (451,615)
AdPL, $\delta = 1 - 0.75N^{-1}$   $\hat N$(s.e.)    533(9.53)    562(7.44)    574(5.70)    536(8.15)
                                  RMSE              34.57        63.05        74.25        36.88
                                  C.I.              (513,552)    (547,577)    (563,584)    (521,551)
AdPL, $\delta = 1 - 1.25N^{-1}$   $\hat N$(s.e.)    505(9.40)    534(6.98)    548(5.21)    510(7.75)
                                  RMSE              10.72        35.23        48.40        13.01
                                  C.I.              (487,524)    (519,547)    (537,557)    (497,525)
AdPL, $\delta = 1 - 1.75N^{-1}$   $\hat N$(s.e.)    492(9.18)    521(6.75)    535(5.00)    499(7.52)
                                  RMSE              12.45        22.04        35.88        9.65
                                  C.I.              (474,510)    (506,534)    (525,545)    (485,512)

(a) with prior $\pi(\phi) = U(0.5, 2)$. [Chatterjee and Mukherjee (2014 [10]), p. 18]


5.2 Simulation Study 2 

Here we examine some frequentist as well as robustness properties of the adjusted profile
likelihood estimate along with the simple estimate $\hat N_{ind} = x_{1\cdot}x_{\cdot 1}/x_{11}$.


Frequentist Coverage Performance: 


Firstly, under the mis-specification threat (see section 2.3), we graphically study the coverage
performance of $\hat N_{ind}\,(= x_{1\cdot}x_{\cdot 1}/x_{11})$ for the true N as N varies. For comparison, we do
the same for our proposed AdPL estimator. We consider all the artificial populations simulated
earlier in section 5.1. For a moderately large population (say, N > 100), we found both
$\hat N_{ind}$ and the AdPL estimate to be approximately normal. Figures 2 and 3 show simultaneous
plots of the 95% relative UCL $(= (\hat N + 1.96\,\mathrm{s.e.}(\hat N))/N)$ and LCL
$(= (\hat N - 1.96\,\mathrm{s.e.}(\hat N))/N)$ corresponding to the estimators $\hat N_{ind}$ and AdPL over
several true N. The motivation behind this unorthodox type of figure is as follows. The relative
LCL and relative UCL enclose 1 with probability 0.95. Hence, we can compare how much the
relative confidence limits for the said estimators deviate from 1 with gradually increasing true
N (here, it ranges from 100 to 1000). For the recapture-prone ($\phi > 1$) cases, Figure 2 shows
that the relative confidence bounds of the AdPL estimate are slightly tighter as well as closer
to 1 in most of the situations compared to $\hat N_{ind}$. Analogously, for the recapture-averse
($\phi < 1$) cases, Figure 3 shows that the confidence bounds of the AdPL estimate are tighter than
those of $\hat N_{ind}$ as N increases, and that they are relatively closer to 1 in all situations
for different N values.

Figure 2: Comparative plots of confidence bands of $\hat N/N$ corresponding to both the estimates,
$\hat N_{ind}$ (dotted line) and AdPL (continuous line), plotted against different true N for
populations P1-P4 (recapture-prone cases). The targeted value of $\hat N/N$ is indicated at 1.0
(representing unbiasedness).
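The relative confidence limits used in these figures are simple to compute. The sketch below is illustrative only: the function name is hypothetical, the true size N = 500 is assumed, and the inputs are the averaged estimates and sample standard errors reported for population P3 in Table 3, not the per-data-set values behind the figures.

```python
def relative_limits(n_hat, se, n_true, z=1.96):
    """Relative LCL and UCL: (n_hat -/+ z * se) / n_true."""
    half_width = z * se
    return (n_hat - half_width) / n_true, (n_hat + half_width) / n_true

# Table 3, population P3 (true N = 500 is an assumption of this sketch).
# List-independence estimate: 480 with s.e. 7.07.
lcl_ind, ucl_ind = relative_limits(480, 7.07, 500)
# AdPL with delta = 1 - 1.75 / N: 504 with s.e. 6.60.
lcl_adpl, ucl_adpl = relative_limits(504, 6.60, 500)

print(round(lcl_ind, 3), round(ucl_ind, 3), lcl_ind <= 1.0 <= ucl_ind)
# prints: 0.932 0.988 False
print(round(lcl_adpl, 3), round(ucl_adpl, 3), lcl_adpl <= 1.0 <= ucl_adpl)
# prints: 0.982 1.034 True
```

In this example the band for the list-independence estimate lies entirely below 1, while the band for the adjusted estimate encloses 1, which is the kind of contrast the figures are meant to display.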

Robustness Consideration: 

Our other interest is in the robustness of the proposed estimator and the usual C-D estimator
$\hat N_{ind}$. The model Mtb is driven by the unidentifiable behavioral effect parameter $\phi$. A
useful estimator of N should be as robust as possible with respect to the underlying $\phi$ value
and hence, in Figure 4, we present a comparative study of both the estimates against different
$\phi$. We fix the true N at 500 and $\phi$ is considered to vary between 0.5 and 3.0. In simulation 1,
four artificial situations are assumed without considering the $\phi$ value. Here we have studied
the robustness for all those four situations. Figure 4 depicts that the AdPL estimate has better
robustness w.r.t. $\phi$ than $\hat N_{ind}$ in all situations.

Figure 3: Comparative plots of confidence bands of $\hat N/N$ corresponding to both the estimates,
$\hat N_{ind}$ (dotted line) and AdPL (continuous line), plotted against different true N for
populations P5-P8 (recapture-averse cases). The targeted value of $\hat N/N$ is indicated at 1.0
(representing unbiasedness).

Figure 4: Comparative plots of confidence bands of $\hat N/N$ corresponding to both the estimates,
$\hat N_{ind}$ (dotted line) and AdPL (continuous line), plotted against different $\phi$ for four
situations. The targeted value of $\hat N/N$ is indicated at 1.0 (representing unbiasedness).
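The sensitivity of the list-independence estimate to $\phi$ can be previewed without simulation by using the leading bias term $N(1 - p_{1\cdot})(1 - \phi)/\phi$ derived in the Appendix. The sketch below is an illustration under assumptions: the function name and the two values of $p_{1\cdot}$ are hypothetical choices, the sign of the leading term is taken as positive for $\phi < 1$, and lower-order terms are ignored.

```python
def relative_expectation_n_ind(phi, p1_dot):
    """First-order E(N_ind) / N under Mtb: 1 + (1 - p1.) * (1 - phi) / phi."""
    return 1.0 + (1.0 - p1_dot) * (1.0 - phi) / phi

# phi swept over the range studied in the text: 0.5, 0.75, ..., 3.0.
phis = [0.5 + 0.25 * k for k in range(11)]

for p1_dot in (0.5, 0.8):  # illustrative first-list capture probabilities
    curve = [round(relative_expectation_n_ind(phi, p1_dot), 3) for phi in phis]
    print(p1_dot, curve)
```

The ratio equals 1 only at $\phi = 1$ and drifts further from 1 when $p_{1\cdot}$ is low, which is consistent with the remark that the list-independence estimate is least reliable when capture probabilities are low and $\phi$ is far from 1.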

5.3 Real Data Illustrations 
5.3.1 Example 1 

An example of DRS data is considered on death counts obtained from a Population Change
Survey conducted by the National Statistical Office in Malawi between 1970 and 1972 (for
details, see Greenfield (1975)). Only two strata, called Lilongwe ($c = 0.593$, $x_{\cdot 1} > x_{1\cdot}$)
and Other urban areas ($c = 0.839$, $x_{\cdot 1} < x_{1\cdot}$), are selected to illustrate the role of
different c values and the opposite nature of $x_{\cdot 1}$ and $x_{1\cdot}$. The significantly lower c value
indicates that the people of Lilongwe seemed less likely than those of Other urban areas to
give the information on deaths again at survey time.

Now, if anyone wishes to use the widely accepted model Mt assuming list-independence and
calculates the simple estimate $\hat N_{ind}$, he or she would find that 365 and 2920 deaths
occurred in Lilongwe and Other urban areas respectively. Nour (1982 [20]) argued that the
assumption of independent collection procedures is unacceptable in reality. Assuming that the
two data sources are positively correlated (i.e. $\phi > 1$) in a human demographic study, Nour
estimated the death counts as 378 (i.e. $\phi = 1.33$) and 3046 (i.e. $\phi = 1.13$) for Lilongwe and
Other urban areas respectively. However, in this article we do not make any such assumption on
the directional nature of $\phi$. We consider the data as just a 2 x 2 DRS data set where nothing
is known about $\phi$. Then, Lee et al.'s fully Bayes method with uniform prior
$\pi(\phi) = U(0.1, 2)$ finds that 372 ($\phi = 1.19$) and 3205 ($\phi = 1.30$) deaths occurred in Lilongwe
and Other urban areas respectively. Our adjusted profile likelihood method estimates the death
counts as 378 ($\phi = 1.33$) and 3428 ($\phi = 1.53$) respectively, taking $\delta = 1 - 4(1 - c)N^{-1}$.
Our estimate agrees with Nour's for Lilongwe, but Nour's estimate for Other urban areas is
significantly smaller than Lee's estimate as well as our estimate.
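To make the choice $\delta = 1 - 4(1 - c)N^{-1}$ concrete, the short sketch below evaluates the coefficient $4(1 - c)$ for the two strata and the resulting $\delta$ at the reported AdPL estimates. It is an illustration only: the function name is hypothetical, and evaluating $\delta$ at the final estimate is a convenience of this sketch, since $\delta$ varies with N inside the likelihood.

```python
def adjustment_coefficient(c, n, k=4.0):
    """delta = 1 - k * (1 - c) / n, the form used in the examples (k = 4)."""
    return 1.0 - k * (1.0 - c) / n

strata = (("Lilongwe", 0.593, 378), ("Other urban areas", 0.839, 3428))
for name, c, n_hat in strata:
    multiplier = round(4.0 * (1.0 - c), 3)          # coefficient of 1/N
    delta = round(adjustment_coefficient(c, n_hat), 5)
    print(name, multiplier, delta)
```

For Lilongwe the multiplier is 1.628, which lies between the values 1.25 and 1.75 explored in Simulation Study 1, whereas the higher recapture rate in Other urban areas gives a smaller multiplier of 0.644.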

5.3.2 Example 2 

Another example of DRS data is considered on injection drug users (IDUs) of greater Victoria,
British Columbia, Canada (Xu et al., 2014). To track the changes in the prevalence of HIV and
hepatitis C, the Public Health Agency of Canada developed the national, cross-sectional I-Track
survey. Phase I and phase II of the I-Track survey were completed in Victoria in 2003 and 2005,
respectively. With only two samples from the I-Track survey (phase I and phase II), some closed
population mark-recapture models were implemented to estimate the number of IDUs in greater
Victoria, BC. They found that the Lincoln-Petersen (LP) estimate, $\hat N_{ind}$ from model Mt, for
the total number of injection drug users was 3329. They also commented that the LP estimator
might not be worthwhile if the independence assumption is violated, when behaviour response
and/or heterogeneity affects the probability of capture. They used Huggins' (1989) conditional
likelihood approach to deal with plausible heterogeneity in the data, and the estimate was 3342.
Moreover, the time ordering of the samples offers an opportunity to use model Mtb. The
epidemiological literature on such hidden or hard-to-reach populations says that individuals who
are listed in the first survey try to avoid the listing operation in the second survey. Thus
there is a high possibility of recapture-aversion (i.e. $\phi < 1$). The low recapture rate
($c = 0.075$) strengthens this possibility.

Considering the DRS data as originating from model Mtb with $\phi > 0$, Lee et al.'s fully Bayes
method with prior $\pi(\phi) = U(0.01, 2)$ finds that there are 596 ($\phi = 0.11$) drug users in that
population. As c is found to be very low, our adjusted profile likelihood method estimates the
number of injection drug users as 584 ($\phi = 0.09$), taking $\delta = 1 - 4(1 - c)N^{-1}$. Hence, Lee's
method and our adjusted profile likelihood method say that, if the population is considered
quite homogeneous, the most general model Mtb suggests that the total number of injection drug
users in greater Victoria is around 580 to 600, a much lower estimate than the estimate of drug
users under independence.


6 Summary and Conclusions 

In the context of population size (N) estimation, the inappropriateness of model Mt for the
Dual-record system (DRS) has been argued in several real life situations. At present, however,
this model is widely employed, especially in census undercount estimation and epidemiology, due
to its simplicity. We have considered the most general model Mtb, which allows the behaviour
response effect to play a significant role along with the time variation effect in estimating N.
The model Mtb suffers from an identifiability problem, which suitable Bayesian methods might
have the potential to overcome. However, in this article we have investigated the usefulness of
pseudo-likelihood approaches based on profiling the interest parameter N. Ordinary profile,
modified profile and approximated modified profile likelihoods have been shown to be of no use
for model Mtb. An adjustment to the profile likelihood (AdPL), tuned by an adjustment
coefficient, is proposed so that a reasonably better solution can be made available. The present
article also shows mathematical and graphical analyses of possible model mis-specification due
to the use of Mt.

The proposed method depends on the choice of $\delta$ (close to $1 - N^{-1}$) using the knowledge of c
and the possible direction of $\phi$. In real life situations, if $\phi$ is unknown, then a uniform
choice is possible. The Bayes method of Lee et al. (2003 [18]) provides better coverage than any
other method, but it also possesses lower efficiency than AdPL in most situations. Moreover,
Lee's method, with its trial-and-error approach to discovering a suitable range for the uniform
prior $\pi(\phi)$, may take a long time. Some other disadvantages are the subjectiveness of the
informative prior $\pi(\phi)$, the highly dispersed conditional posterior of $\phi$, etc. Thus, our
proposed adjusted method is useful for obtaining an efficient estimate of population size (N)
very quickly from this complex DRS. In addition, AdPL helps to produce more efficient
alternatives, especially in recapture-prone situations.

Appendix 

Proof of Theorem 

At first we shall derive the bias of $(x_{1\cdot}x_{\cdot 1}/x_{11})$ in terms of the original DRS cell
probabilities. In the multinomial setup, we have $E(x_{ab}) = Np_{ab}$ and
$Cov(x_{ab}, x_{cd}) = -Np_{ab}p_{cd}$, for $a, b, c, d \in \{1, 2\}$. Then, replacing x, y and z by
$x_{1\cdot}$, $x_{\cdot 1}$ and $x_{11}$ respectively in Lemma 1 above gives an expression for
$E(x_{1\cdot}x_{\cdot 1}/x_{11})$.
Hence, $Bias(x_{1\cdot}x_{\cdot 1}/x_{11}) = E(x_{1\cdot}x_{\cdot 1}/x_{11}) - N = -N(1 - p_0) + N(p_{01}p_{10}/p_{11}) + (p_{01}p_{10}/p_{11})$.
Now, in Mtb, $c = \phi p = p_{11}/p_{1\cdot}$ and $p = p_{01}/(1 - p_{1\cdot})$. Hence, after some algebraic
simplification, we find $Bias(x_{1\cdot}x_{\cdot 1}/x_{11}) = N(1 - p_{1\cdot})(1 - \phi)/\phi + O(1)$. □
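The algebraic simplification behind the leading term can be written out as follows. This is a reader's sketch rather than the authors' derivation; it assumes that $1 - p_0$ in the expression above is the probability $p_{00}$ of being missed by both lists, and it uses only the Mtb relations $c = \phi p = p_{11}/p_{1\cdot}$ and $p = p_{01}/(1 - p_{1\cdot})$.

```latex
\begin{align*}
p_{11} &= p_{1\cdot}\,\phi p, & p_{10} &= p_{1\cdot}(1 - \phi p), \\
p_{01} &= (1 - p_{1\cdot})\,p, & p_{00} &= (1 - p_{1\cdot})(1 - p), \\
-N p_{00} + N\,\frac{p_{01}\,p_{10}}{p_{11}}
  &= N(1 - p_{1\cdot})\left[-(1 - p) + \frac{1 - \phi p}{\phi}\right]
   = N(1 - p_{1\cdot})\,\frac{1 - \phi}{\phi}.
\end{align*}
```

As a check against the simulations, taking N = 500 (an assumption here) with $p_{1\cdot} = 0.80$ and $\phi = 1.25$, as for population P3, gives a leading bias of $500 \times 0.2 \times (-0.25)/1.25 = -20$, which matches the averaged list-independence estimate of 480 reported for P3 in Table 3.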


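Proof of Theorem

At first define $R^P(N) = L^P(N+1)/L^P(N)$. After some algebraic simplification we have

$$R^P(N) = \frac{(N - x_{1\cdot} + 1)(N - x_{\cdot 1} + 1)}{(N - x_0 + 1)(N + 1)} \left(\frac{N}{N+1}\right)^{2N} \left(1 + \frac{1}{N - x_{1\cdot}}\right)^{N - x_{1\cdot}} \left(1 + \frac{1}{N - x_{\cdot 1}}\right)^{N - x_{\cdot 1}}.$$

Now,

$$\frac{(N - x_{1\cdot} + 1)(N - x_{\cdot 1} + 1)}{(N - x_0 + 1)(N + 1)} > 1 \iff N < (x_{1\cdot}x_{\cdot 1}/x_{11}) - 1.$$

Therefore, $R^P(N) > 1$ for all $N < (x_{1\cdot}x_{\cdot 1}/x_{11}) - 1$. Hence, the corresponding mle
$\hat N^P$ is $(x_{1\cdot}x_{\cdot 1}/x_{11}) - 1$ when $(x_{1\cdot}x_{\cdot 1}/x_{11})$ is an integer. When
$(x_{1\cdot}x_{\cdot 1}/x_{11})$ is not an integer, $\hat N^P$ equals either $[x_{1\cdot}x_{\cdot 1}/x_{11}] - 1$ or
$[x_{1\cdot}x_{\cdot 1}/x_{11}]$, whichever attains the maximum value of the profile likelihood $L^P(N)$,
where $[u]$ denotes the greatest integer less than or equal to u, for $u \in \mathbb{R}$. Thus, in
general, $\hat N^P = [x_{1\cdot}x_{\cdot 1}/x_{11}] - 1$ or $[x_{1\cdot}x_{\cdot 1}/x_{11}]$, and $\hat N^P$ is finite iff
$x_{11} > 0$.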

□ 
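The statement of this proof can be checked numerically on small examples. The sketch below is a brute-force illustration, not part of the proof: the function names, the search bound `n_max` and the two sets of counts are hypothetical. It assumes the profile likelihood is obtained by substituting $p_{1\cdot} = x_{1\cdot}/N$ and $p_{\cdot 1} = x_{\cdot 1}/N$ into the Mt log-likelihood given in the next proof, and it drops terms that do not involve N.

```python
from math import lgamma, log, floor

def xlogx(m):
    """m * log(m) with the convention 0 * log(0) = 0."""
    return m * log(m) if m > 0 else 0.0

def profile_loglik_mt(N, x11, x1_dot, x_dot1):
    """Profile log-likelihood of N under Mt, up to terms free of N.

    x0 = x1. + x.1 - x11 is the number of distinct individuals observed.
    """
    x0 = x1_dot + x_dot1 - x11
    return (lgamma(N + 1) - lgamma(N - x0 + 1)
            + xlogx(N - x1_dot) + xlogx(N - x_dot1) - 2 * N * log(N))

def profile_mle_mt(x11, x1_dot, x_dot1, n_max=2000):
    """Maximise the profile log-likelihood over integers x0 <= N <= n_max."""
    x0 = x1_dot + x_dot1 - x11
    return max(range(x0, n_max + 1),
               key=lambda N: profile_loglik_mt(N, x11, x1_dot, x_dot1))

# Integer case: x1. * x.1 / x11 = 80 * 90 / 60 = 120.
mle_integer_case = profile_mle_mt(60, 80, 90)
# Non-integer case: 80 * 90 / 55 = 130.909...
mle_other_case = profile_mle_mt(55, 80, 90)
print(mle_integer_case, mle_other_case, floor(80 * 90 / 55))
```

In the integer case the search returns 119, that is $(x_{1\cdot}x_{\cdot 1}/x_{11}) - 1$, and in the second case it returns one of the two candidates $[x_{1\cdot}x_{\cdot 1}/x_{11}] - 1$ and $[x_{1\cdot}x_{\cdot 1}/x_{11}]$ named in the proof.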


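Proof of Result

According to the parametrization $\theta = N$ and $\psi = (p_{1\cdot}, p_{\cdot 1})$, it is straightforward to
show that the log-likelihood for model Mt is

$$\ell(\theta, \psi) = \sum_{i=1}^{x_0} \ln(N - x_0 + i) + x_{1\cdot}\ln p_{1\cdot} + x_{\cdot 1}\ln p_{\cdot 1} + (N - x_{1\cdot})\ln(1 - p_{1\cdot}) + (N - x_{\cdot 1})\ln(1 - p_{\cdot 1}).$$

Hence,

$$\ell_\psi(\theta, \psi) = \left(\frac{x_{1\cdot}}{p_{1\cdot}} - \frac{N - x_{1\cdot}}{1 - p_{1\cdot}},\ \frac{x_{\cdot 1}}{p_{\cdot 1}} - \frac{N - x_{\cdot 1}}{1 - p_{\cdot 1}}\right)$$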

and 


E{£f{9,i/iy,eo,iio} 


_ / Npol 

N - Npoi 

Npio 

N - Npio \ 

8o=9,iljQ=ip 1 p_^ 

1 -Pi. 

P.I 

1-P.i / 


Therefore, |/*(0, 0, ^)| 


(N—xi,){N—x,i) ’ 


Since 


i’e = and 


i\e,ip-,9,if) = ^E{ifie,iii)-eo,i/io} 

dip 


JV I N 
Pi. 1-Pi. 


0 


jv , 

P.I i-p.i 


6»o=e,'i/>o=b 


N 


Again, from Severini (2000), we have , ignoring the terms not 

depending on data. So, it is clear that ^{6,ipg)\ = \E{6,iljg]6,ijj)\. Thus, from remark 

1. L^^{6) = L^^{e) for Mt and jf^{9,ipe) = j^^}, which 

leads to the proof of this result using @ □ 


Proof of Theorem

Let us define $R^{MP}(N) = L^{MP}(N+1)/L^{MP}(N)$. Then $R^{MP}(N)$ equals $R^P(N)$ multiplied by an
additional factor, where $R^P(N) = L^P(N+1)/L^P(N)$. Now, by some algebraic manipulation it can be
shown that this additional factor exceeds 1 for all N above a threshold, and that this threshold
is always less than $x_0$. So, $R^{MP}(N) > R^P(N) > 1$ for all $x_0 < N < (x_{1\cdot}x_{\cdot 1}/x_{11}) - 1$.
Therefore, the maximum profile likelihood estimate $\hat N^P$ is always less than or equal to the
maximum modified profile likelihood estimate $\hat N^{MP}$. □

Proof of Theorem 

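From the two preceding theorems we have $R^{MP}(N) > R^P(N) > 1$ for all
$N < (x_{1\cdot}x_{\cdot 1}/x_{11}) - 1$. Now, if $L^P(N)$ is maximum at $N = \tilde N$ (say), then
$R^P(\tilde N) < 1 < R^P(\tilde N - 1) < R^{MP}(\tilde N - 1)$ if $\tilde N - 1 > x_0$. Since
$R^P(N) < R^{MP}(N)$ for $N > x_0$, i.e. $(x_{10}x_{01}/x_{11}) > 1$, one has to check whether
$R^{MP}(\tilde N) > 1$ or not, for the different possible $\tilde N$.

Now, it is clear that if $\tilde N = [x_{1\cdot}x_{\cdot 1}/x_{11}] - 1$, then $R^{MP}(\tilde N) > 1$ since
$[x_{1\cdot}x_{\cdot 1}/x_{11}] - 1 < (x_{1\cdot}x_{\cdot 1}/x_{11}) - 1$; therefore $\hat N^{MP} = [x_{1\cdot}x_{\cdot 1}/x_{11}]$.

If $\tilde N = [x_{1\cdot}x_{\cdot 1}/x_{11}]$, then $R^{MP}(\tilde N) < 1$ since
$[x_{1\cdot}x_{\cdot 1}/x_{11}] > (x_{1\cdot}x_{\cdot 1}/x_{11}) - 1$; therefore $\hat N^{MP} = [x_{1\cdot}x_{\cdot 1}/x_{11}]$.

When $(x_{1\cdot}x_{\cdot 1}/x_{11})$ is an integer, $\tilde N = (x_{1\cdot}x_{\cdot 1}/x_{11}) - 1$, therefore
$R^{MP}(\tilde N) < 1$ and hence $\hat N^{MP} = (x_{1\cdot}x_{\cdot 1}/x_{11}) - 1$.

Hence, the associated mle $\hat N^{MP}$ is equal to $(x_{1\cdot}x_{\cdot 1}/x_{11}) - 1$ if
$(x_{1\cdot}x_{\cdot 1}/x_{11})$ is an integer; otherwise $\hat N^{MP} = [x_{1\cdot}x_{\cdot 1}/x_{11}]$. All estimates are
finite iff $x_{11} > 0$. □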

Proof of Theorem 

Let us define $\ell'_{tb}(N) = (d/dN)\log L^{AP}(N)$. We have

$$\ell'_{tb}(N) = \beta(N + 1) - \beta(N - x_0 + 1) - \log N + \frac{\delta - 3/2 - N}{N} + \frac{\delta - 1}{N - x_{1\cdot}} + \log(N - x_0) + \frac{N - x_0 + 1/2}{N - x_0},$$

where $\beta(\cdot)$ denotes the digamma function. After some algebraic simplification using the
asymptotic approximation of the digamma function, we have
$\ell'_{tb}(N) = (\delta - 1)/N + (\delta - 1)/(N - x_{1\cdot}) + A_N$, where $A_N$ is a positive quantity that
decreases to zero and is $O(N^{-2})$. Clearly, if $\delta = 1$, $\ell'_{tb}(N) = A_N > 0$ for all $N > x_0$.
When $\delta > 1$, $\ell'_{tb}(N) = O(N^{-1}) > 0$ for all $N > x_0$. Therefore, $L^{AP}(N)$ is strictly
increasing for $N > x_0$ if $\delta \geq 1$ and hence a finite mle does not exist for $\delta \geq 1$. Again,
if $\delta < 1$, then $\ell'_{tb}(N) = A_N + B_N$, where
$B_N = (\delta - 1)(2N - x_{1\cdot})/\{N(N - x_{1\cdot})\} < 0$ increases to zero. So, there may exist some
N at which $\log L^{AP}(N)$ attains a maximum. If $B_N$ dominates $A_N$ for all N, then the maximum
coincides with the lowest value, i.e. $(x_0 + 1)$. Hence we can establish that, for any
$\delta < 1$, $(x_0 + 1) \leq \hat N < \infty$. Thus, a finite mle for Mtb exists only when $\delta < 1$. □
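The simplification step in this proof can be spelled out with the standard expansion of the digamma function. This is a reader's sketch under assumptions: the digamma function is written $\beta(\cdot)$ to match the notation read from the source, the expansion $\beta(z + 1) = \log z + 1/(2z) + O(z^{-2})$ is the usual large-$z$ approximation, and all remainder terms are collected into the quantity called $A_N$ in the proof.

```latex
\begin{align*}
\beta(N + 1) - \log N &= \frac{1}{2N} + O(N^{-2}), \\
\log(N - x_0) - \beta(N - x_0 + 1) &= -\frac{1}{2(N - x_0)} + O\big((N - x_0)^{-2}\big), \\
\frac{\delta - 3/2 - N}{N} + \frac{N - x_0 + 1/2}{N - x_0}
  &= \frac{\delta - 3/2}{N} + \frac{1}{2(N - x_0)}.
\end{align*}
```

Adding the three lines to the remaining term $(\delta - 1)/(N - x_{1\cdot})$, the contributions $\pm 1/\{2(N - x_0)\}$ cancel and $1/(2N) + (\delta - 3/2)/N = (\delta - 1)/N$, which leaves $(\delta - 1)/N + (\delta - 1)/(N - x_{1\cdot})$ plus the remainder. The first two terms combine into $(\delta - 1)(2N - x_{1\cdot})/\{N(N - x_{1\cdot})\}$, the quantity $B_N$ of the proof.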

Proof of Theorem (b):

In the case of model Mtb, as $L^P(N) \downarrow N$ for $N > x_0$ and $L^{MP}(N) \uparrow N$ for $N > x_0$,
we can say from the earlier result that $(1 - x_0/N)^{1/2}$ increases in N at a greater rate than
the rate of decrement of $L^P(N)$. Now, $(1 - x_{1\cdot}/N)^{\delta - 1}$ decreases with N for $\delta < 1$.
Therefore, from the same result one can say that there must exist some $\delta_0 < 1$ such that, for
all $\delta < \delta_0$, $L^{AP}(N) \downarrow N$, and hence the proof. □